Convergent Cross Mapping (CCM)
School on Physics Applications in Biology, January 2018
This tutorial presents the general idea of Convergent Cross Mapping (CCM, Ye et al. 2017), then shows some practical examples using synthetic data, and finally proposes a real-data analysis as an exercise. For these activities, you will need the most recent version of R and the rEDM package installed on your computer.
After running this tutorial, you can review the concepts of cross mapping and learn more details about the rEDM library in the introductory documentation of the package (vignette), which you can find here or in your local R installation with the command
From Chaos to Chaos-ality
General idea [1]
One of the corollaries to the Generalized Takens' Theorem is that it should be possible to cross predict or cross map between variables that are observed from the same dynamical system. Consider two variables, X and Y, that interact in a dynamical system. Then the univariate reconstructions based on X alone should uniquely identify the system state and thus the corresponding value of Y (and vice versa).
Synthetic Data
First, let's simulate a deterministic discrete-time dynamical system with chaotic behavior. To create a 150-step time series following these dynamics, run the commands below in the R console:
## Two vectors to store data
X <- c()
Y <- c()
## Initial values
X[1] <- 0.1
Y[1] <- 0.1
## Iterate the dynamics 150 time steps
for(i in 2:150){
X[i] <- 3.77*X[i-1]*(1-X[i-1])
Y[i] <- 3.82*Y[i-1]*(1-Y[i-1]-0.05*X[i-1])
}
XY<-as.data.frame(cbind(X,Y))
Note that the variable X is causing changes in Y. However, if we plot the X and Y time series of this chaotic system, they do not seem to be related:
X and Y do not seem to be related, although we know that X and Y are coupled (the data are synthetic!). Moreover, we can calculate the correlation coefficient between X and Y:
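The correlation can be checked directly with base R's `cor` function; the snippet below repeats the simulation from above so that it is self-contained:

```r
## Re-run the coupled logistic maps and compute the
## Pearson correlation between X and Y
X <- Y <- numeric(150)
X[1] <- Y[1] <- 0.1
for (i in 2:150) {
  X[i] <- 3.77 * X[i - 1] * (1 - X[i - 1])
  Y[i] <- 3.82 * Y[i - 1] * (1 - Y[i - 1] - 0.05 * X[i - 1])
}
cor(X, Y)  # weak, despite the built-in coupling
```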
The results show that X and Y are not correlated, even though the data comes from a model with known causality (i.e. X is causing changes in Y). This is a clear example that “lack of correlation does not imply lack of causation”.
Cross Mapping [2]
How can we then extract the causality of \(X\) on \(Y\) from their dynamics? As we have seen in previous sections, a generic property of the reconstructed shadow manifold is that the states of \(X(t)\) on the shadow manifold (\(M_X\)) map one-to-one to states on the original attractor manifold \(M\), and local neighborhoods on \(M_X\) map to local neighborhoods on \(M\).
It follows that for two variables \(X\) and \(Y\) that are dynamically coupled, local neighborhoods on their lagged reconstructions (\(M_X\) and \(M_Y\), respectively) will map to each other since \(X\) and \(Y\) are essentially alternative observations of the common original attractor manifold \(M\).
Convergent cross mapping (CCM) determines how well local neighborhoods on \(M_X\) correspond to local neighborhoods on \(M_Y\) and vice versa. To do so, a manifold \(M_X\) is constructed from lags of the variable \(X\) and used to estimate the states of \(Y\) at the same time points. Similarly, a manifold \(M_Y\) is constructed from lags of the variable \(Y\) and used to estimate the states of \(X\) at the same times.
To do so, we first need to obtain the optimal embedding dimension for both variables using the simplex function (see tutorial XXX if you have doubts about optimal embedding).
options(warn = -1)
simplex_X<-simplex(X,silent=T)
plot(simplex_X$rho,type='o', ylab = "Forecast Skill (rho)", xlab="Embedding Dimension (E)")
E_star_X<-which.max(simplex_X$rho)
print(paste('E*(X) =',E_star_X))
## [1] "E*(X) = 2"
simplex_Y<-simplex(Y,silent=T)
plot(simplex_Y$rho,type='o', ylab = "Forecast Skill (rho)", xlab="Embedding Dimension (E)")
E_star_Y<-which.max(simplex_Y$rho)
print(paste('E*(Y) =',E_star_Y))
## [1] "E*(Y) = 2"
The optimal embedding (E*) dimensions for X and Y are equal to two.
Constructing the manifolds \(M_X\) and \(M_Y\)
Now that we have the optimal embeddings, we can construct the shadow manifolds \(M_X\) and \(M_Y\) using the make_block function from the rEDM package. Use the command below to load this function into your R workspace (it will be included in rEDM itself in future releases of the package):
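The chunk defining `make_block` is not reproduced here, but for reference, a minimal base-R stand-in with the same interface for the lag-only case used in this tutorial could look like the following. This is a sketch, not the package's actual implementation; it only mimics the column layout (time, X, X_1, Y, Y_1) shown in the output below:

```r
## Minimal stand-in for make_block (sketch): for each column of the
## input data frame, append max_lag - 1 lagged copies (X_1 is X
## shifted down by one step, padded with NA), plus a time column
make_block <- function(block, max_lag = 1) {
  out <- data.frame(time = seq_len(nrow(block)))
  for (nm in names(block)) {
    out[[nm]] <- block[[nm]]
    for (lag in seq_len(max_lag - 1)) {
      out[[paste(nm, lag, sep = "_")]] <-
        c(rep(NA, lag), head(block[[nm]], -lag))
    }
  }
  out
}
```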
Now let's create the shadow manifold for both X and Y. Since E* = 2, we will use two dimensions to construct the shadow manifolds \(M_X\) and \(M_Y\). This means that \(M_X\) will be built from X(t) and its lag X(t-1); similarly, \(M_Y\) is created from Y(t) and Y(t-1).
Shadow_MXY<-make_block(XY,max_lag = 2) # max_lag is the optimal embedding dimension
Shadow_MX<-Shadow_MXY[,2:3]
Shadow_MY<-Shadow_MXY[,4:5]
head(Shadow_MXY)
## time X X_1 Y Y_1
## 1 1 0.1000000 NA 0.1000000 NA
## 2 2 0.3393000 0.1000000 0.3418900 0.1000000
## 3 3 0.8451417 0.3393000 0.8373481 0.3418900
## 4 4 0.4934071 0.8451417 0.3851034 0.8373481
## 5 5 0.9423361 0.4934071 0.8682788 0.3851034
## 6 6 0.2048571 0.9423361 0.2806179 0.8682788
The table above shows the X and Y time series and their respective lags X_1 and Y_1. The manifold \(M_X\) is thus composed of X and X_1, and \(M_Y\) of Y and Y_1.
Cross-mapping from \(M_X\) to \(M_Y\) (X_xmap_Y)
To better understand the cross mapping, let's start by using X to predict one single value of Y. Here, we are using the term "prediction", but instead of predicting future values of X (as in the simplex tutorial LINK), we will predict values of Y(t) using X(t) and vice versa. This "cross-prediction" is performed between different variables but for the same point in time.
Suppose we want to predict the value y(t = 70). We first store the index of this single predictor point and look up its observed value:
predictor <- 70
Y[predictor]
## [1] 0.8606044
We are going to use the nearest neighbours of x(70) within the manifold \(M_X\) to predict y(70). To do this, we first create a matrix of distances among the states of \(M_X\) and then we select the E* + 1 (2 + 1 = 3) nearest neighbours within \(M_X\).
dist.matrix_X <- as.matrix(dist(Shadow_MX, upper=TRUE))
neigb_X <- order(dist.matrix_X[predictor,])[2:4]
neigh_X_print<-c(neigb_X)
print(paste('simplex_Mx:',list(neigh_X_print)))
## [1] "simplex_Mx: c(31, 93, 130)"
We can also plot \(M_X\) with the predictor (red dot) and the respective neighbours simplex_Mx (blue dots):
The cross-mapping process starts by mapping the simplex_Mx into \(M_Y\), creating the simplex_My. Note that simplex_My has the same time indices as simplex_Mx. The figure below shows \(M_Y\) and simplex_My (green dots).
The simplex_My will then be used to estimate the contemporaneous value of the predictor, y(70), in \(M_Y\), obtaining the predicted value \(\tilde{y}(70)\).
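To make this step concrete, here is a hand-rolled version of the single-point estimate in base R. It is a sketch, not rEDM's exact procedure: it uses exponentially decaying distance weights over the E* + 1 neighbours (the standard simplex weighting), and it repeats the simulation so the snippet is self-contained:

```r
## Estimate y(70) from the neighbours of x(70) on M_X (sketch)
X <- Y <- numeric(150); X[1] <- Y[1] <- 0.1
for (i in 2:150) {
  X[i] <- 3.77 * X[i - 1] * (1 - X[i - 1])
  Y[i] <- 3.82 * Y[i - 1] * (1 - Y[i - 1] - 0.05 * X[i - 1])
}
tt <- 2:150                     # times with a complete lag vector
MX <- cbind(X[tt], X[tt - 1])   # rows of the shadow manifold M_X
p  <- which(tt == 70)           # row of the predictor x(70)
d  <- as.matrix(dist(MX))[p, ]
nn <- order(d)[2:4]             # E* + 1 = 3 nearest neighbours, self excluded
w  <- exp(-d[nn] / d[nn[1]])    # simplex-style exponential weights
w  <- w / sum(w)
y_hat <- sum(w * Y[tt[nn]])     # map the neighbours into Y at the same times
c(estimated = y_hat, observed = Y[70])
```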
Now we can expand the idea from predicting a single state of Y based on X to predicting all the states of Y using X. Doing this prediction for every value of Y yields a vector of the predicted states of Y (using X) alongside the observed states of Y. We can then compare predicted and observed:
lib <- c(1, NROW(Shadow_MXY))
block_lnlp_output_XY <- block_lnlp(Shadow_MXY, lib = lib, pred = lib, columns = c("X",
"X_1"), target_column = "Y", stats_only = FALSE, first_column_time = TRUE)
observed_all_Y <- block_lnlp_output_XY$model_output[[1]]$obs
predicted_all_Y <- block_lnlp_output_XY$model_output[[1]]$pred
pred_obs_Y<-as.data.frame(cbind(predicted_all_Y,observed_all_Y))
colnames(pred_obs_Y)<-c('Predicted Y','Observed Y')
head(pred_obs_Y)
## Predicted Y Observed Y
## 1 0.4975515 0.8373481
## 2 0.6043905 0.3851034
## 3 0.7493451 0.8682788
## 4 0.5365754 0.2806179
## 5 0.6601371 0.7601691
## 6 0.5649814 0.6072697
Below is the plot for all predictions of Y (mapped from X: X_xmap_Y) against the real observations, plus the Pearson’s correlation coefficient. The red dot represents the predictor (t=70) used in the example above.
Cross-mapping from \(M_Y\) to \(M_X\) (Y_xmap_X)
Similarly to the previous mapping, we can now do the opposite: use Y to predict values of X. We obtain the indices of the E* + 1 nearest neighbours of the given predictor. Since we are now cross mapping from \(M_Y\) to \(M_X\), the simplex_My is generated directly from the shadow manifold \(M_Y\). Again, we first create a matrix of distances among the states of \(M_Y\) and then select the E* + 1 nearest neighbours.
dist.matrix_Y <- as.matrix(dist(Shadow_MY, upper=TRUE))
neigb_Y <- order(dist.matrix_Y[predictor,])[2:4]
neigh_Y_print<-c(neigb_Y)
print(paste('simplex_My:',list(neigh_Y_print)))
## [1] "simplex_My: c(135, 5, 54)"
The following plot presents \(M_Y\) with the predictor (red dot) and respective simplex_My (blue dots).
Next, we map the simplex_My onto \(M_X\), creating the simplex_Mx. Analogously, simplex_Mx has the same time indices as simplex_My. The figure below shows \(M_X\) and simplex_Mx (green dots).
The simplex_Mx will then be used to estimate the contemporaneous value of the predictor in \(M_X\), obtaining the predicted value \(\tilde{x}(70)\).
lib <- c(1, NROW(Shadow_MXY))
block_lnlp_output_YX <- block_lnlp(Shadow_MXY, lib = lib, pred = lib, columns = c("Y",
"Y_1"), target_column = "X", stats_only = FALSE, first_column_time = TRUE)
observed_all_X <- block_lnlp_output_YX$model_output[[1]]$obs
predicted_all_X <- block_lnlp_output_YX$model_output[[1]]$pred
pred_obs_X<-as.data.frame(cbind(predicted_all_X,observed_all_X))
colnames(pred_obs_X)<-c('Predicted X','Observed X')
head(pred_obs_X)
## Predicted X Observed X
## 1 0.7513307 0.8451417
## 2 0.4587703 0.4934071
## 3 0.9306259 0.9423361
## 4 0.2179312 0.2048571
## 5 0.6259113 0.6140977
## 6 0.9040863 0.8934210
Below is the plot for all predictions of X (mapped from Y: Y_xmap_X) against the real observations, plus the Pearson’s correlation coefficient. The red dot represents the predictor (t=70) used in the example above.
Note that the Pearson correlation of Y_xmap_X (r = 0.68) is different from that of X_xmap_Y (r = -0.08).
But what causes what?
The Pearson correlation of Y_xmap_X is different from that of X_xmap_Y. When we used Y to estimate (map) X, we obtained good predictions (r = 0.68). However, when we used X to estimate (map) Y, the correlation between estimated and observed was quite low (r = -0.08). This means that there is information about the dynamics of X inside the dynamics of Y, so we can use Y to estimate X. And if information about X is recorded in Y, then X must somehow influence, and hence cause, Y. Indeed, since all our data is synthetic, we know this is true: we created the variable X independently of Y, and X was causing changes in Y. So when X causes Y, the cross mapping from \(M_Y\) to \(M_X\) (Y_xmap_X) will generally give good results (high correlations), but X_xmap_Y will not. It is very important to notice this inverse relation between cross mapping and causality: if X causes Y (X -> Y), it is the cross map of X from Y (Y_xmap_X) that shows positive correlation.
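This asymmetry can be reproduced end to end with a hand-rolled cross map in base R. The snippet is a sketch with exponential simplex weights and E = 2, not rEDM's block_lnlp, so the exact numbers will differ slightly from those quoted above:

```r
## Cross-map skill in both directions for the coupled logistic maps
X <- Y <- numeric(150); X[1] <- Y[1] <- 0.1
for (i in 2:150) {
  X[i] <- 3.77 * X[i - 1] * (1 - X[i - 1])
  Y[i] <- 3.82 * Y[i - 1] * (1 - Y[i - 1] - 0.05 * X[i - 1])
}
xmap <- function(from, to) {    # estimate `to` from the manifold of `from`
  tt <- 2:150
  M  <- cbind(from[tt], from[tt - 1])
  D  <- as.matrix(dist(M))
  pred <- sapply(seq_along(tt), function(p) {
    nn <- order(D[p, ])[2:4]    # E + 1 = 3 nearest neighbours
    w  <- exp(-D[p, nn] / D[p, nn[1]])
    sum(w / sum(w) * to[tt[nn]])
  })
  cor(pred, to[tt])             # skill: correlation of predicted vs observed
}
c(X_xmap_Y = xmap(X, Y), Y_xmap_X = xmap(Y, X))
```

The Y_xmap_X skill should come out clearly higher than X_xmap_Y, matching the asymmetry discussed above.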
Convergent Cross Mapping (CCM)
Convergence means that the estimates from the cross-mapping improve with increasing time-series length. The length L is the sample size used to construct a library.
This means that the more data you have (the larger the L), the more trajectories you have to define the attractor, resulting in closer nearest neighbors and less estimation error (i.e. a higher correlation coefficient between estimated and observed). This convergence of the cross mapping is a necessary condition for causation: a true causal coupling leaves an imprint on the attractor that is resolved ever more finely as the library grows, so the cross-map skill should rise and then plateau with L, whereas a merely spurious association will not show this systematic improvement.
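Convergence can be illustrated with the same hand-rolled cross map restricted to libraries of increasing length L. This is a sketch, not rEDM's ccm function, but it shows the expected rise-and-plateau pattern for Y_xmap_X:

```r
## Y_xmap_X skill as a function of library length L: it should rise
## and then plateau, the signature of convergence
X <- Y <- numeric(150); X[1] <- Y[1] <- 0.1
for (i in 2:150) {
  X[i] <- 3.77 * X[i - 1] * (1 - X[i - 1])
  Y[i] <- 3.82 * Y[i - 1] * (1 - Y[i - 1] - 0.05 * X[i - 1])
}
xmap_skill <- function(L) {     # cross map X from M_Y using the first L points
  tt <- 2:L
  MY <- cbind(Y[tt], Y[tt - 1])
  D  <- as.matrix(dist(MY))
  pred <- sapply(seq_along(tt), function(p) {
    nn <- order(D[p, ])[2:4]
    w  <- exp(-D[p, nn] / D[p, nn[1]])
    sum(w / sum(w) * X[tt[nn]])
  })
  cor(pred, X[tt])
}
sapply(c(20, 50, 100, 150), xmap_skill)
```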
Exercises
What happens when the cause-effect relation is inverted? Change the model so that the variable Y causes changes in X.
What if there is no interaction between X and Y?
What about both variables interacting with each other (X causing Y and Y causing X)?
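As a starting point for the first exercise, reversing the coupling only requires moving the coupling term from one map to the other (a sketch; the coupling strength 0.05 is kept from the original model):

```r
## Reversed model: Y now drives X, and Y evolves independently
X <- Y <- numeric(150)
X[1] <- Y[1] <- 0.1
for (i in 2:150) {
  X[i] <- 3.77 * X[i - 1] * (1 - X[i - 1] - 0.05 * Y[i - 1])  # Y enters X's map
  Y[i] <- 3.82 * Y[i - 1] * (1 - Y[i - 1])                    # Y is now autonomous
}
```

Re-running the cross-mapping analysis on this model should now show X_xmap_Y, rather than Y_xmap_X, as the informative direction.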
Identifying causal networks within systems is very important for effective policy and management recommendations on e.g. climate, ecosystem services, epidemiology and financial regulation. In the following exercise, you should use CCM to identify causality between anchovy landings in California and Newport Pier sea-surface temperature.
Learn more
* Sugihara G. and R. M. May. 1990. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature 344:734–741.
* Sugihara G. 1994. Nonlinear forecasting for the classification of natural time series. Philosophical Transactions: Physical Sciences and Engineering 348:477–495.
* Anderson C. & Sugihara G. Simplex projection - Order out the chaos. http://deepeco.ucsd.edu/simplex/
* “Simplex projection walkthrough”, a tutorial by Owen Petchey.
[1] Taken from the [rEDM vignette](https://cran.r-project.org/web/packages/rEDM/vignettes/rEDM-tutorial.html), section “Causality Inference and Cross Mapping”.
[2] This section is adapted from: Sugihara et al., Detecting Causality in Complex Ecosystems (Supplementary Materials), Science, 2012.